Global Landslides Evaluation & Visualization

Import Libraries

In [ ]:
# import libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

import folium
from folium import plugins
from folium import Marker
from folium.plugins import MarkerCluster, HeatMap

import math
import warnings
warnings.filterwarnings("ignore")

Getting To Know The Dataset

In [ ]:
# read dataset to pandas dataframe

df = pd.read_csv('/content/drive/MyDrive/Colab Materials/Land Slide Datset NASA/Global_Landslide_Catalog_Export.csv')
In [ ]:
# display first 5 rows of the dataset

df.head()
Out[ ]:
source_name source_link event_id event_date event_time event_title event_description location_description location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name photo_link notes event_import_source event_import_id country_name country_code admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance submitted_date created_date last_edited_date longitude latitude
0 AGU https://blogs.agu.org/landslideblog/2008/10/14... 684 08/01/2008 12:00:00 AM NaN Sigou Village, Loufan County, Shanxi Province occurred early in morning, 11 villagers buried... Sigou Village, Loufan County, Shanxi Province unknown landslide rain large mine 11.0 NaN NaN NaN NaN glc 684.0 China CN Shaanxi 0.0 Jingyang 41.02145 04/01/2014 12:00:00 AM 11/20/2017 03:17:00 PM 02/15/2018 03:51:00 PM 107.4500 32.5625
1 Oregonian http://www.oregonlive.com/news/index.ssf/2009/... 956 01/02/2009 02:00:00 AM NaN Lake Oswego, Oregon Hours of heavy rain are to blame for an overni... Lake Oswego, Oregon 5km mudslide downpour small unknown 0.0 NaN NaN NaN NaN glc 956.0 United States US Oregon 36619.0 Lake Oswego 0.60342 04/01/2014 12:00:00 AM 11/20/2017 03:17:00 PM 02/15/2018 03:51:00 PM -122.6630 45.4200
2 CBS News https://www.cbsnews.com/news/dozens-missing-af... 973 01/19/2007 12:00:00 AM NaN San Ramon district, 195 miles northeast of the... (CBS/AP) At least 10 people died and as many a... San Ramon district, 195 miles northeast of the... 10km landslide downpour large unknown 10.0 NaN NaN NaN NaN glc 973.0 Peru PE Junín 14708.0 San Ramón 0.85548 04/01/2014 12:00:00 AM 11/20/2017 03:17:00 PM 02/15/2018 03:51:00 PM -75.3587 -11.1295
3 Reuters https://in.reuters.com/article/idINIndia-41450... 1067 07/31/2009 12:00:00 AM NaN Dailekh district One person was killed in Dailekh district, pol... Dailekh district unknown landslide monsoon medium unknown 1.0 NaN NaN NaN NaN glc 1067.0 Nepal NP Mid Western 20908.0 Dailekh 0.75395 04/01/2014 12:00:00 AM 11/20/2017 03:17:00 PM 02/15/2018 03:51:00 PM 81.7080 28.8378
4 The Freeman http://www.philstar.com/cebu-news/621414/lands... 2603 10/16/2010 12:00:00 PM NaN sitio Bakilid in barangay Lahug Another landslide in sitio Bakilid in barangay... sitio Bakilid in barangay Lahug 5km landslide tropical_cyclone medium unknown 0.0 NaN Supertyphoon Juan (Megi) NaN NaN glc 2603.0 Philippines PH Central Visayas 798634.0 Cebu City 2.02204 04/01/2014 12:00:00 AM 11/20/2017 03:17:00 PM 02/15/2018 03:51:00 PM 123.8978 10.3336
In [ ]:
# display last 5 rows

df.tail()
Out[ ]:
source_name source_link event_id event_date event_time event_title event_description location_description location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name photo_link notes event_import_source event_import_id country_name country_code admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance submitted_date created_date last_edited_date longitude latitude
11028 The Jakarta Post http://www.thejakartapost.com/news/2017/04/02/... 11109 04/01/2017 01:34:00 PM NaN Major landslide in Banaran Landslide exacerbated by deforestation and bad... Banaran, Ponorogo, Jawa Timur, Indonesia 5km landslide rain medium natural_slope 27.0 0.0 NaN http://img.jakpost.net/c/2017/04/02/2017_04_02... NaN NaN NaN NaN NaN NaN NaN NaN NaN 07/28/2017 01:34:00 PM 12/19/2017 09:42:00 PM 02/15/2018 03:51:00 PM 111.679944 -7.853409
11029 Greater Kashmir http://www.greaterkashmir.com/news/jammu/lands... 10845 03/25/2017 05:32:00 PM NaN Barnari Sigdi Landslide Two teenage girls died after they were buried ... Barnari Sigdi area, Tehsil Mughalmaidan, Kisht... 5km landslide other small natural_slope 2.0 0.0 NaN NaN Nallah is a steep narrow valley. NaN NaN NaN NaN NaN NaN NaN NaN 09/21/2017 05:32:00 PM 12/05/2017 06:45:00 PM 02/15/2018 03:51:00 PM 75.680611 33.403080
11030 NBC Daily http://www.nbcdaily.com/separate-landslides-ki... 10973 12/15/2016 05:00:00 AM NaN Landslide at Pub Sarania Hill An octogenarian was killed when a sudden lands... Pub Sarania Hill, Guwahati, Assam, India 1km landslide unknown small urban 1.0 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 07/26/2017 01:22:00 PM 12/08/2017 08:37:00 PM 02/15/2018 03:51:00 PM 91.772042 26.181606
11031 AGU Landslide Blog http://blogs.agu.org/landslideblog/2017/05/02/... 10901 04/29/2017 07:03:00 PM NaN Mayor landslide at Ayu village Landslide triggered by heavy rainfall buried 1... Ayu, Ozgon, Osh, Kyrgyzstan 1km translational_slide downpour large natural_slope 24.0 NaN NaN http://blogs.agu.org/landslideblog/files/2017/... NaN NaN NaN NaN NaN NaN NaN NaN NaN 07/14/2017 07:03:00 PM 12/07/2017 09:19:00 PM 02/15/2018 03:51:00 PM 73.472379 40.886395
11032 The Times of India https://timesofindia.indiatimes.com/city/hyder... 10949 03/13/2017 02:32:00 PM NaN Kondapur Commercial Complex Construction Mudslide A mudslide at an under-construction commercial... Hyderabad, Rangareddy, Telangana 1km mudslide construction small urban 2.0 0.0 NaN https://timesofindia.indiatimes.com/thumb/msid... NaN NaN NaN NaN NaN NaN NaN NaN NaN 10/05/2017 02:32:00 PM 12/08/2017 07:57:00 PM 02/15/2018 03:51:00 PM 78.356505 17.465630
In [ ]:
# display shape of the data

df.shape
Out[ ]:
(11033, 31)
In [ ]:
# display features data types

df.dtypes
Out[ ]:
source_name                   object
source_link                   object
event_id                       int64
event_date                    object
event_time                   float64
event_title                   object
event_description             object
location_description          object
location_accuracy             object
landslide_category            object
landslide_trigger             object
landslide_size                object
landslide_setting             object
fatality_count               float64
injury_count                 float64
storm_name                    object
photo_link                    object
notes                         object
event_import_source           object
event_import_id              float64
country_name                  object
country_code                  object
admin_division_name           object
admin_division_population    float64
gazeteer_closest_point        object
gazeteer_distance            float64
submitted_date                object
created_date                  object
last_edited_date              object
longitude                    float64
latitude                     float64
dtype: object

Data Cleaning

In [ ]:
# drop unwanted columns

df.drop(['event_id', 'event_time', 'location_description', 'event_title', 'event_description', 'photo_link', 'notes',
         'event_import_source', 'event_import_id', 'country_code', 'submitted_date', 'created_date', 'last_edited_date'],
        axis=1,
        inplace=True)
In [ ]:
# print the remaining columns after dropping unwanted columns

for i in df.columns:
  print(i)
source_name
source_link
event_date
location_accuracy
landslide_category
landslide_trigger
landslide_size
landslide_setting
fatality_count
injury_count
storm_name
country_name
admin_division_name
admin_division_population
gazeteer_closest_point
gazeteer_distance
longitude
latitude
In [ ]:
# checking for null values

df.isnull().sum()
Out[ ]:
source_name                      0
source_link                    846
event_date                       0
location_accuracy                2
landslide_category               1
landslide_trigger               23
landslide_size                   9
landslide_setting               69
fatality_count                1385
injury_count                  5674
storm_name                   10456
country_name                  1562
admin_division_name           1637
admin_division_population     1562
gazeteer_closest_point        1563
gazeteer_distance             1562
longitude                        0
latitude                         0
dtype: int64
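Raw null counts are easier to judge as a share of the 11,033 rows; `storm_name`, for example, is missing in 10,456 of them (about 95%). A minimal sketch of such a summary follows. The helper name `missing_summary` and the toy frame are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the null count and null share (percent) per column, largest share first."""
    counts = frame.isnull().sum()
    share = (counts / len(frame) * 100).round(1)
    summary = pd.DataFrame({"missing": counts, "missing_pct": share})
    return summary.sort_values("missing_pct", ascending=False)

# Toy frame standing in for the catalog (illustrative values only)
sample = pd.DataFrame({
    "storm_name": [None, None, None, "Megi"],
    "injury_count": [None, 0.0, None, 2.0],
    "latitude": [32.5, 45.4, -11.1, 10.3],
})
print(missing_summary(sample))
```

On the real data the same call would be `missing_summary(df)`.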
In [ ]:
# change data type of 'event_date' Column

df['event_date_cal'] = pd.to_datetime(df['event_date'])
In [ ]:
# split date & time into separate columns

df['Date'] = df['event_date_cal'].dt.date
df['Time'] = df['event_date_cal'].dt.time
In [ ]:
df.drop(['event_date','event_date_cal'],
        axis=1,
        inplace=True)
In [ ]:
# display the result

df.head(2)
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time
0 AGU https://blogs.agu.org/landslideblog/2008/10/14... unknown landslide rain large mine 11.0 NaN NaN China Shaanxi 0.0 Jingyang 41.02145 107.450 32.5625 2008-08-01 00:00:00
1 Oregonian http://www.oregonlive.com/news/index.ssf/2009/... 5km mudslide downpour small unknown 0.0 NaN NaN United States Oregon 36619.0 Lake Oswego 0.60342 -122.663 45.4200 2009-01-02 02:00:00
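The `event_date` strings shown earlier look like `08/01/2008 12:00:00 AM`, so the parse can be made explicit instead of relying on format inference. This is a sketch under the assumption that every row follows that month/day/year, 12-hour pattern; the constant and function names are hypothetical.

```python
import pandas as pd

# Format assumed from the rows displayed above, e.g. "08/01/2008 12:00:00 AM"
EVENT_DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"

def parse_event_date(values: pd.Series) -> pd.Series:
    """Parse month/day/year 12-hour timestamps; values that do not match become NaT."""
    return pd.to_datetime(values, format=EVENT_DATE_FORMAT, errors="coerce")

raw = pd.Series(["08/01/2008 12:00:00 AM", "01/02/2009 02:00:00 AM", "not a date"])
parsed = parse_event_date(raw)
print(parsed.dt.date.tolist())
```

Using `errors="coerce"` turns malformed entries into `NaT`, which makes them easy to count with `isnull()` rather than stopping the notebook.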

Exploratory Data Analysis & Visualization

Event Reported Source

In [ ]:
Reported_source = pd.DataFrame(df['source_name'].value_counts().head(15)).reset_index()
Reported_source.columns = ['Source Name','Reported Count']
Reported_source
Out[ ]:
Source Name Reported Count
0 Oregon DOT 768
1 maps.google.com 104
2 thehimalayantimes 75
3 news.xinhuanet 74
4 newsinfo.inquirer 71
5 thejakartapost 59
6 ibnlive.in 57
7 Times of India 47
8 The Jakarta Post 46
9 The Himalayan Times 43
10 articles.timesofindia.indiatimes.com 42
11 Seattle Times 41
12 laht 40
13 The Hindu 38
14 GMA News 38
In [ ]:
# visualize the number of events reported by each source

plt.figure(figsize=(18,12))
sns.barplot(x="Reported Count", y="Source Name", 
            data=Reported_source,
            palette="gist_earth")

plt.xticks(size=12)
plt.title('Top 15 Sources By Number Of Reported Events',size=16)
plt.xlabel('Reported Events',size=10)
plt.show()

Oregon DOT Reported By Far The Largest Number Of Events During This Period (1988 - 2017)
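The top-15 table lists some outlets under more than one spelling, for example `thejakartapost` (59) and `The Jakarta Post` (46), or `thehimalayantimes` (75) and `The Himalayan Times` (43). If each pair is the same outlet, which is an assumption to confirm by hand, merging them would change the ranking below Oregon DOT. A minimal sketch with a hypothetical alias table:

```python
import pandas as pd

# Hypothetical alias table: the spellings on the left appear in the top-15 list,
# but treating them as the same outlet is an assumption, not a verified fact.
SOURCE_ALIASES = {
    "thejakartapost": "The Jakarta Post",
    "thehimalayantimes": "The Himalayan Times",
}

def normalize_sources(names: pd.Series) -> pd.Series:
    """Strip surrounding whitespace and map known aliases to one canonical spelling."""
    cleaned = names.str.strip()
    return cleaned.replace(SOURCE_ALIASES)

sample = pd.Series(["thejakartapost", "The Jakarta Post ", "Oregon DOT", "thehimalayantimes"])
print(normalize_sources(sample).value_counts())
```

Applied to the notebook, `normalize_sources(df['source_name'])` could replace the raw column before calling `value_counts()`.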

Geospatial Visualization Of Globally Events


Open Street Map Style

In [ ]:
# Create the map
map_1 = folium.Map(location=[51.1657,10.4515], tiles='cartodbpositron', zoom_start=2) 

mc1 = MarkerCluster()

for idx, row in df.iterrows():
    # skip rows without coordinates
    if not math.isnan(row['longitude']) and not math.isnan(row['latitude']):
        mc1.add_child(Marker(location=[row['latitude'], row['longitude']]))

# add the marker cluster to the map
map_1.add_child(mc1)

# Display the map
map_1

Heat Map Style

In [ ]:
# Create the map
map_2 = folium.Map(location=[51.1657,10.4515], zoom_start=2) 

# List comprehension to build a list of [latitude, longitude] pairs
heat_data = [[row['latitude'],row['longitude']] for index, row in df.iterrows()]

# Plot it on the map
HeatMap(heat_data).add_to(map_2)

minimap = plugins.MiniMap()
map_2.add_child(minimap)


# Display the map
map_2

According to the above maps, a large share of landslide events happened in:

  • The Indian Ocean region (South and Southeast Asia)
  • North America
  • South America
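The `[[latitude, longitude], ...]` lists that feed the heat maps are built row by row with `iterrows()`. The same list can be produced by selecting the two columns directly, which is usually faster on a frame of this size. A sketch, with the helper name `to_heat_data` as an assumption:

```python
import pandas as pd

def to_heat_data(frame: pd.DataFrame) -> list:
    """Return [[lat, lon], ...] for rows where both coordinates are present."""
    coords = frame[["latitude", "longitude"]].dropna()
    return coords.values.tolist()

# Toy rows; the last one lacks a latitude and is dropped
sample = pd.DataFrame({
    "latitude": [32.5625, 45.42, None],
    "longitude": [107.45, -122.663, 10.0],
})
heat_data = to_heat_data(sample)
print(heat_data)
```

The result has the same shape as the list passed to `HeatMap(...)` in the cells above, so it can be used in its place.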

Events By Years

In [ ]:
# max available date

df['Date'].max()
Out[ ]:
datetime.date(2017, 9, 28)
In [ ]:
# minimum available date

df['Date'].min()
Out[ ]:
datetime.date(1988, 11, 7)
In [ ]:
# split year from date
df['year'] = pd.to_datetime(df['Date']).dt.year
In [ ]:
# verify the result

df.head(2)
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year
0 AGU https://blogs.agu.org/landslideblog/2008/10/14... unknown landslide rain large mine 11.0 NaN NaN China Shaanxi 0.0 Jingyang 41.02145 107.450 32.5625 2008-08-01 00:00:00 2008
1 Oregonian http://www.oregonlive.com/news/index.ssf/2009/... 5km mudslide downpour small unknown 0.0 NaN NaN United States Oregon 36619.0 Lake Oswego 0.60342 -122.663 45.4200 2009-01-02 02:00:00 2009
In [ ]:
#group by years
gr_by_years = pd.DataFrame(df.groupby('year')['source_name'].count().reset_index())

#change columns names
gr_by_years.columns = ['year','Occured_Events']
gr_by_years
Out[ ]:
year Occured_Events
0 1988 1
1 1993 1
2 1995 1
3 1996 2
4 1997 10
5 1998 12
6 2003 2
7 2004 1
8 2005 2
9 2006 13
10 2007 412
11 2008 553
12 2009 423
13 2010 1536
14 2011 1324
15 2012 794
16 2013 1132
17 2014 1035
18 2015 1341
19 2016 1183
20 2017 1255
In [ ]:
# Visualize events by year

fig, ax = plt.subplots(1, 1, figsize=[18, 8])
ax.plot(gr_by_years['year'], gr_by_years['Occured_Events'])

plt.xlabel('Years',size=15)
plt.ylabel('Count',size=15)

plt.legend(['Occurred Events'], loc=2)
ax.set_title('Occurred Events By Years',size=17)
Out[ ]:
Text(0.5, 1.0, 'Occurred Events By Years')
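The yearly table skips years with no recorded events (for example 1989 to 1992 and 1999 to 2002), so a line plot connects the remaining points directly. One option is to reindex onto a continuous year range and fill the gaps with zero. A sketch, assuming the `year` and `Occured_Events` column names from the table above; the helper name is hypothetical:

```python
import pandas as pd

def fill_missing_years(counts: pd.DataFrame) -> pd.DataFrame:
    """Reindex yearly counts onto a continuous year range, using 0 for absent years."""
    by_year = counts.set_index("year")["Occured_Events"]
    first, last = int(by_year.index.min()), int(by_year.index.max())
    filled = by_year.reindex(list(range(first, last + 1)), fill_value=0)
    return filled.rename_axis("year").reset_index()

# First rows of the yearly table above
sample = pd.DataFrame({"year": [1988, 1993, 1995, 1996], "Occured_Events": [1, 1, 1, 2]})
print(fill_missing_years(sample))
```

Whether zero is the right fill depends on interpretation: a missing year may mean no landslides were catalogued rather than none occurred.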

Events By Months

In [ ]:
# split month from date
df['month'] = pd.to_datetime(df['Date']).dt.month

#check the result
df.head(2)
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year month
0 AGU https://blogs.agu.org/landslideblog/2008/10/14... unknown landslide rain large mine 11.0 NaN NaN China Shaanxi 0.0 Jingyang 41.02145 107.450 32.5625 2008-08-01 00:00:00 2008 8
1 Oregonian http://www.oregonlive.com/news/index.ssf/2009/... 5km mudslide downpour small unknown 0.0 NaN NaN United States Oregon 36619.0 Lake Oswego 0.60342 -122.663 45.4200 2009-01-02 02:00:00 2009 1
In [ ]:
#group by months
gr_by_months = pd.DataFrame(df.groupby('month')['source_name'].count().reset_index())

#change columns names
gr_by_months.columns = ['month','Occured_Events']
gr_by_months
Out[ ]:
month Occured_Events
0 1 968
1 2 805
2 3 987
3 4 843
4 5 796
5 6 988
6 7 1294
7 8 1149
8 9 887
9 10 736
10 11 643
11 12 937
In [ ]:
# Visualize event count by month

fig, ax = plt.subplots(1, 1, figsize=[18, 8])
ax.plot(gr_by_months['month'], gr_by_months['Occured_Events'])

plt.xlabel('Months',size=15)
plt.ylabel('Count',size=15)

plt.legend(['Occurred Events'], loc=2)
ax.set_title('Occurred Events By Months',size=17)
Out[ ]:
Text(0.5, 1.0, 'Occurred Events By Months')

Most Events Occurred During The Third Quarter Of The Year (July To September)
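The quarterly comparison can be checked directly from the monthly table. The sketch below copies the twelve counts from that table and sums them per quarter; the variable names are illustrative.

```python
import pandas as pd

# Monthly counts copied from the table above (index = month number)
monthly = pd.Series(
    [968, 805, 987, 843, 796, 988, 1294, 1149, 887, 736, 643, 937],
    index=range(1, 13),
)

# Months 1-3 -> quarter 1, 4-6 -> quarter 2, 7-9 -> quarter 3, 10-12 -> quarter 4
quarterly = monthly.groupby(lambda month: (month - 1) // 3 + 1).sum()
print(quarterly)
```

The third quarter totals 3,330 events, against 2,760, 2,627 and 2,316 for the first, second and fourth quarters.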

Events In 2010

In [ ]:
# Filter events in 2010

year_2010 = pd.DataFrame(df[(df['year'] == 2010)])
year_2010
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year month
4 The Freeman http://www.philstar.com/cebu-news/621414/lands... 5km landslide tropical_cyclone medium unknown 0.0 NaN Supertyphoon Juan (Megi) Philippines Central Visayas 798634.0 Cebu City 2.02204 123.897800 10.333600 2010-10-16 12:00:00 2010 10
56 CNN http://www.cnn.com/2010/WORLD/americas/01/01/b... 25km mudslide downpour large unknown 22.0 28.0 NaN Brazil Rio de Janeiro 153635.0 Angra dos Reis 2.58880 -44.322498 -23.013744 2010-01-01 00:00:00 2010 1
58 globo http://g1.globo.com/Noticias/Rio/0,,MUL1560027... 1km landslide downpour medium unknown 0.0 NaN NaN Brazil Rio de Janeiro 456456.0 Niterói 2.64466 -43.074098 -22.892344 2010-04-05 00:00:00 2010 4
59 E-Pao http://www.e-pao.net/GP.asp?src=28..210810.aug10 25km landslide downpour medium unknown 0.0 NaN NaN India Manipur 15118.0 Phek 39.51186 94.482147 25.309863 2010-07-29 23:00:00 2010 7
60 Dawn https://www.dawn.com/news/551761 5km landslide downpour medium unknown 0.0 NaN NaN Pakistan Gilgit-Baltistan 2005.0 Barishāl 41.15433 74.860900 36.656700 2010-08-06 00:00:00 2010 8
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
9823 tv.repubblica http://tv.repubblica.it/edizione/milano/il-mal... 5km landslide downpour medium unknown 0.0 NaN NaN Italy Liguria 1392.0 Vezzano Ligure 0.89744 9.879200 44.147300 2010-10-31 23:00:00 2010 10
9825 9wsyr http://www.9wsyr.com/mostpopular/story/Cortlan... 5km mudslide tropical_cyclone small unknown 0.0 NaN Tropical Storm Nicole United States New York 1053.0 McGraw 6.65468 -76.090900 42.536300 2010-09-30 00:00:00 2010 9
9827 dawn http://www.dawn.com/wps/wcm/connect/dawn-conte... 25km landslide rain large unknown 17.0 NaN NaN Pakistan Khyber Pakhtunkhwa 0.0 Alpūrai 12.05629 72.705600 35.011100 2010-07-29 00:00:00 2010 7
9857 fresnobee http://www.fresnobee.com/local/story/1793572.html 10km mudslide downpour small unknown 0.0 NaN NaN United States California 1295.0 Shandon 24.12478 -120.152100 35.774000 2010-01-22 00:00:00 2010 1
9858 maps.google.com http://maps.google.com.br/maps/ms?source=embed... exact landslide downpour medium unknown 0.0 NaN NaN Brazil Rio de Janeiro 147281.0 Nilópolis 12.04688 -43.385700 -22.913100 2010-04-06 00:00:00 2010 4

1536 rows × 21 columns

Visualize Events That Occurred In 2010 On A Geospatial Map

In [ ]:
# Create the map
map_3 = folium.Map(location=[51.1657,10.4515], zoom_start=2) 

# List comprehension to build a list of [latitude, longitude] pairs
heat_data2 = [[row['latitude'],row['longitude']] for index, row in year_2010.iterrows()]

# Plot it on the map
HeatMap(heat_data2).add_to(map_3)

# Display the map
map_3

In 2010 Most Events Occurred Around The Indian Ocean

Events In 2017

In [ ]:
year_2017 = pd.DataFrame(df[(df['year'] == 2017)])
year_2017
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year month
13 Vietnamnet http://english.vietnamnet.vn/fms/society/18126... 10km landslide downpour small above_road 0.0 0.0 NaN NaN NaN NaN NaN NaN 105.529997 23.143830 2017-07-03 14:33:00 2017 7
42 WYMT http://www.wymt.com/content/news/Community-pul... 1km mudslide rain small urban 0.0 0.0 NaN NaN NaN NaN NaN NaN -83.712126 36.770029 2017-05-19 20:14:00 2017 5
53 Vietnamnet http://english.vietnamnet.vn/fms/society/18126... 25km landslide downpour small above_road 0.0 0.0 NaN NaN NaN NaN NaN NaN 104.453085 22.690403 2017-07-03 14:33:00 2017 7
73 Nation News http://www.nationnews.com/nationnews/news/9742... 10km landslide tropical_cyclone medium natural_slope 2.0 NaN Tropical Storm Beatriz NaN NaN NaN NaN NaN -96.222559 16.098800 2017-06-02 13:34:00 2017 6
77 KMPH http://kmph.com/news/local/yosemite-park-entra... 25km rock_fall unknown small above_road 0.0 0.0 NaN NaN NaN NaN NaN NaN -119.779508 37.676076 2017-01-06 14:34:00 2017 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
11027 St. Maries Gazette Record http://www.gazetterecord.com/news/article_ac60... exact mudslide rain small above_road 0.0 0.0 NaN NaN NaN NaN NaN NaN -116.777680 47.449165 2017-03-23 16:36:00 2017 3
11028 The Jakarta Post http://www.thejakartapost.com/news/2017/04/02/... 5km landslide rain medium natural_slope 27.0 0.0 NaN NaN NaN NaN NaN NaN 111.679944 -7.853409 2017-04-01 13:34:00 2017 4
11029 Greater Kashmir http://www.greaterkashmir.com/news/jammu/lands... 5km landslide other small natural_slope 2.0 0.0 NaN NaN NaN NaN NaN NaN 75.680611 33.403080 2017-03-25 17:32:00 2017 3
11031 AGU Landslide Blog http://blogs.agu.org/landslideblog/2017/05/02/... 1km translational_slide downpour large natural_slope 24.0 NaN NaN NaN NaN NaN NaN NaN 73.472379 40.886395 2017-04-29 19:03:00 2017 4
11032 The Times of India https://timesofindia.indiatimes.com/city/hyder... 1km mudslide construction small urban 2.0 0.0 NaN NaN NaN NaN NaN NaN 78.356505 17.465630 2017-03-13 14:32:00 2017 3

1255 rows × 21 columns

Visualize Events That Occurred In 2017 On A Geospatial Map

In [ ]:
# Create the map
map_4 = folium.Map(location=[51.1657,10.4515], zoom_start=2) 

# List comprehension to build a list of [latitude, longitude] pairs
heat_data3 = [[row['latitude'],row['longitude']] for index, row in year_2017.iterrows()]

# Plot it on the map
HeatMap(heat_data3).add_to(map_4)

# Display the map
map_4

In 2017 Most Events Occurred Around The Indian Ocean, With High Counts In North And South America As Well

Events By Category

In [ ]:
def event_by_category():

  #print value count
  print(df['landslide_category'].value_counts()) 

  #print in countplot graph
  plt.figure(figsize=(18,8.5))
  sns.countplot(x=df['landslide_category'],palette='Set2')
  plt.xticks(rotation='vertical',size=15)
  plt.title('By Landslide Category',size=16)
  plt.xlabel('',size=10)
  plt.show()


event_by_category()
landslide              7648
mudslide               2100
rock_fall               671
complex                 232
debris_flow             194
other                    68
unknown                  38
riverbank_collapse       37
snow_avalanche           15
translational_slide       9
earth_flow                7
lahar                     7
creep                     5
topple                    1
Name: landslide_category, dtype: int64

Events By Size

In [ ]:
def event_by_size():

  #print value count
  print(df['landslide_size'].value_counts()) 

  #print in countplot graph
  plt.figure(figsize=(18,8.5))
  sns.countplot(x=df['landslide_size'],palette='Set2_r')
  plt.xticks(rotation='vertical',size=15)
  plt.title('By Size',size=16)
  plt.xlabel('',size=10)
  plt.show()


event_by_size()
medium          6551
small           2767
unknown          851
large            750
very_large       102
catastrophic       3
Name: landslide_size, dtype: int64
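The counts above are listed by frequency, which hides the natural ordering of the size classes. A sketch of listing them from smallest to largest follows; the ordering in `SIZE_ORDER` (with `unknown` placed last) is an assumption, and the helper name is hypothetical.

```python
import pandas as pd

# Assumed ordering from smallest to largest; "unknown" is placed last
SIZE_ORDER = ["small", "medium", "large", "very_large", "catastrophic", "unknown"]

def size_counts_ordered(sizes: pd.Series) -> pd.Series:
    """Count events per size class, listed in SIZE_ORDER instead of by frequency."""
    return sizes.value_counts().reindex(SIZE_ORDER, fill_value=0)

sample = pd.Series(["medium", "small", "medium", "large", "unknown", "medium"])
print(size_counts_ordered(sample))
```

The same list can be passed to `sns.countplot` through its `order` parameter so the bars appear in that sequence.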

Very Large Events Geospatial Visualization

In [ ]:
size_large = pd.DataFrame(df[(df['landslide_size'] == 'very_large')])
size_large
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year month
65 Google Earth NaN 10km mudslide downpour very_large unknown 0.0 NaN NaN Brazil Rio de Janeiro 153361.0 Nova Friburgo 9.07433 -42.601800 -22.233100 2011-01-12 00:00:00 2011 1
167 Local News 8 http://www.localnews8.com/news/remote-landslid... 25km landslide no_apparent_trigger very_large natural_slope 0.0 0.0 NaN NaN NaN NaN NaN NaN -109.484768 43.218436 2017-07-05 19:39:00 2017 7
198 Brudirect https://www.brudirect.com/news.php?id=12111 exact mudslide downpour very_large urban 0.0 0.0 NaN NaN NaN NaN NaN NaN 106.801076 -6.311708 2016-08-19 17:15:00 2016 8
252 LA Times http://www.latimes.com/local/lanow/la-me-ln-bi... exact landslide rain very_large above_road 0.0 0.0 NaN NaN NaN NaN NaN NaN -121.432384 35.865628 2017-05-20 13:34:00 2017 5
355 NewsFlare https://www.newsflare.com/video/83244/weather-... 5km landslide rain very_large natural_slope 0.0 0.0 NaN NaN NaN NaN NaN NaN 12.132183 46.539613 2016-08-16 17:45:00 2016 8
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
10459 The Watchers https://watchers.news/2017/02/26/massive-mine-... exact landslide rain very_large mine 0.0 0.0 NaN NaN NaN NaN NaN NaN 18.077692 44.143538 2017-02-24 19:45:00 2017 2
10622 CNN http://www.cnn.com/2017/02/06/asia/avalanche-a... 25km snow_avalanche snowfall_snowmelt very_large natural_slope 50.0 NaN NaN NaN NaN NaN NaN NaN 71.305578 35.727633 2017-02-06 21:44:00 2017 2
10783 The Bubble http://www.thebubble.com/two-dead-and-more-tha... 1km mudslide downpour very_large natural_slope 2.0 5.0 NaN NaN NaN NaN NaN NaN -65.467344 -23.916049 2017-01-10 17:22:00 2017 1
10845 CBC News http://www.cbc.ca/news/world/colombia-mudslide... 1km landslide rain very_large above_road 17.0 0.0 NaN NaN NaN NaN NaN NaN -75.499849 6.358246 2016-10-26 14:24:00 2016 10
11012 FloodList http://floodlist.com/asia/indonesia-floods-lan... 1km landslide downpour very_large above_river 4.0 2.0 NaN NaN NaN NaN NaN NaN 101.458434 0.541576 2017-03-03 20:35:00 2017 3

102 rows × 21 columns

In [ ]:
map_5 = folium.Map(location=[51.1657,10.4515], zoom_start=2) 

# List comprehension to build a list of [latitude, longitude] pairs
heat_data4 = [[row['latitude'],row['longitude']] for index, row in size_large.iterrows()]

# Plot it on the map
HeatMap(heat_data4).add_to(map_5)

# Display the map
map_5

Events By Settings Type

In [ ]:
def event_by_setting():

  #print value count
  print(df['landslide_setting'].value_counts()) 

  #print in countplot graph
  plt.figure(figsize=(18,8.5))
  sns.countplot(x=df['landslide_setting'],palette='YlGnBu_r')
  plt.xticks(rotation='vertical',size=15)
  plt.title('Event By Settings Type',size=16)
  plt.xlabel('',size=10)
  plt.show()


event_by_setting()
unknown             6291
above_road          3104
natural_slope        531
urban                264
below_road           199
mine                 157
above_river          149
deforested_slope      53
other                 50
bluff                 48
retaining_wall        48
burned_area           28
engineered_slope      22
above_coast           20
Name: landslide_setting, dtype: int64

According to the above chart, most events with a known setting happened above roads (3,104 events), although 6,291 events have an unknown setting. This suggests that road construction is a major contributing factor. Building roads with better safety measures and prior slope analysis could reduce the number of these events.
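Because `unknown` dominates the setting counts, the weight of `above_road` is clearer as a share of events whose setting is known. The sketch below copies the counts from the output above; the variable names are illustrative.

```python
import pandas as pd

# Counts copied from the value_counts() output above
setting_counts = pd.Series({
    "unknown": 6291, "above_road": 3104, "natural_slope": 531, "urban": 264,
    "below_road": 199, "mine": 157, "above_river": 149, "deforested_slope": 53,
    "other": 50, "bluff": 48, "retaining_wall": 48, "burned_area": 28,
    "engineered_slope": 22, "above_coast": 20,
})

# Drop the "unknown" bucket, then express each setting as a share of the rest
known = setting_counts.drop("unknown")
share_of_known = (known / known.sum()).round(3)
print(share_of_known.head())
```

Of the 4,673 events with a known setting, roughly two thirds are recorded as `above_road`.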

Events By Trigger Type

In [ ]:
def event_by_trigger():

  #print value count
  print(df['landslide_trigger'].value_counts()) 

  #print in countplot graph
  plt.figure(figsize=(18,8.5))
  sns.countplot(x=df['landslide_trigger'],palette='crest')
  plt.xticks(rotation='vertical',size=15)
  plt.title('Events By Trigger Type',size=16)
  plt.xlabel('',size=10)
  plt.show()


event_by_trigger()
downpour                   4680
rain                       2592
unknown                    1691
continuous_rain             748
tropical_cyclone            561
snowfall_snowmelt           135
monsoon                     129
mining                       93
earthquake                   89
construction                 82
flooding                     75
no_apparent_trigger          44
freeze_thaw                  41
other                        26
dam_embankment_collapse      12
leaking_pipe                 10
vibration                     1
volcano                       1
Name: landslide_trigger, dtype: int64

Events Triggered By Earthquakes

In [ ]:
size_eth = pd.DataFrame(df[(df['landslide_trigger'] == 'earthquake')])
size_eth
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year month
299 CBS News http://www.cbsnews.com/news/guatemala-earthqua... 25km landslide earthquake medium natural_slope 0.0 0.0 NaN NaN NaN NaN NaN NaN -91.896976 15.122773 2017-06-14 13:27:00 2017 6
348 Bluefield Daily Telegraph http://www.bdtonline.com/news/crews-work-to-st... 1km landslide earthquake small natural_slope 0.0 0.0 NaN NaN NaN NaN NaN NaN -80.891116 37.357183 2017-05-18 13:15:00 2017 5
658 gns http://www.gns.cri.nz/static/pubs/2013/SR%2020... exact rock_fall earthquake small natural_slope 0.0 0.0 NaN New Zealand Canterbury 1187.0 Pleasant Point 108.91108 170.179700 -43.569800 2013-01-05 14:00:00 2013 1
757 NBC News http://www.nbcnews.com/slideshow/stronger-japa... 25km rock_fall earthquake small above_road 0.0 0.0 NaN Japan Kumamoto 26677.0 Kikuchi 1.11260 130.826600 32.977800 2016-04-16 00:00:00 2016 4
767 Nation http://nation.com.pk/national/01-Oct-2016/man-... 5km landslide earthquake medium unknown 0.0 0.0 NaN Pakistan Khyber Pakhtunkhwa 0.0 Athmuqam 42.62052 73.655800 34.900200 2016-10-01 00:00:00 2016 10
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
10367 Manila Bulletin http://news.mb.com.ph/2017/07/08/aftershocks-t... 5km landslide earthquake medium unknown 1.0 0.0 NaN NaN NaN NaN NaN NaN 124.592383 11.020323 2017-07-06 13:23:00 2017 7
10524 Rappler http://www.rappler.com/nation/175038-landslide... 10km landslide earthquake medium natural_slope 0.0 26.0 NaN NaN NaN NaN NaN NaN 124.653260 11.209318 2017-07-06 13:23:00 2017 7
10694 Malaysian Digest http://www.malaysiandigest.com/world/669146-ea... 1km rock_fall earthquake small above_road 1.0 2.0 NaN NaN NaN NaN NaN NaN -89.335196 13.701013 2017-04-11 13:34:00 2017 4
10781 GMA News Online http://www.gmanetwork.com/news/news/regions/60... 5km landslide earthquake small unknown 0.0 0.0 NaN NaN NaN NaN NaN NaN 120.910893 13.976411 2017-04-08 13:19:00 2017 4
10888 SunStar Manila http://www.sunstar.com.ph/manila/local-news/20... 5km landslide earthquake small above_road 0.0 0.0 NaN NaN NaN NaN NaN NaN 121.104998 13.626157 2017-04-08 20:28:00 2017 4

89 rows × 21 columns

Number Of Earthquake-Triggered Events

In [ ]:
len(size_eth)
Out[ ]:
89
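`DataFrame.size` counts cells (rows × columns), so for the 89-row, 21-column frame above it evaluates to 89 × 21 = 1869, while `len()` and `.shape[0]` give the number of rows, that is, events. A small sketch on a toy frame (values are illustrative):

```python
import pandas as pd

# Toy frame with 3 rows and 2 columns
frame = pd.DataFrame({
    "latitude": [15.1, 37.3, -43.5],
    "longitude": [-91.8, -80.8, 170.1],
})

n_rows = len(frame)          # number of rows (events)
n_rows_alt = frame.shape[0]  # same value, read from the (rows, columns) tuple
n_cells = frame.size         # rows * columns
print(n_rows, n_rows_alt, n_cells)
```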

Earthquake-Triggered Events By Setting

In [ ]:
def earth_quake_setting():

  #print value count
  print(size_eth['landslide_setting'].value_counts()) 

  #print in countplot graph
  plt.figure(figsize=(18,8.5))
  sns.countplot(x=size_eth['landslide_setting'],palette='Set2_r')
  plt.xticks(rotation='vertical',size=15)
  plt.title('Earthquake-Triggered Events By Setting',size=16)
  plt.xlabel('',size=10)
  plt.show()


earth_quake_setting()
unknown             34
above_road          23
natural_slope       22
above_river          4
urban                2
deforested_slope     1
retaining_wall       1
mine                 1
below_road           1
Name: landslide_setting, dtype: int64

According to the above graph, many events have an unknown setting (34), and the highest known counts are in "above road" (23) and "natural slope" (22).

Earthquake-Triggered Events Geospatial Visualization

In [ ]:
map_6 = folium.Map(location=[51.1657,10.4515], zoom_start=2) 

# List comprehension to build a list of [latitude, longitude] pairs
heat_data5 = [[row['latitude'],row['longitude']] for index, row in size_eth.iterrows()]

# Plot it on the map
HeatMap(heat_data5).add_to(map_6)

# Display the map
map_6

According to the above map, earthquake-triggered landslides occur near tectonic plate boundaries. Many of these events are recorded with unknown details; because they also lie along plate boundaries, plate movement is the likely underlying cause. Plate movement therefore appears to be the main driver of the earthquakes behind these events.

Top 10 Countries By Events

In [ ]:
df['country_name'].value_counts().head(10)
Out[ ]:
United States     2992
India             1265
Philippines        675
Nepal              481
China              426
Indonesia          355
United Kingdom     229
Brazil             214
Canada             174
Malaysia           171
Name: country_name, dtype: int64

Fatalities & Injuries

Highest Fatalities Event Details

In [ ]:
df[(df['fatality_count'] == df['fatality_count'].max())]
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year month
5694 blogs.agu.org http://blogs.agu.org/landslideblog/2013/06/21/... 1km debris_flow downpour very_large unknown 5000.0 NaN NaN India Uttarakhand 2000.0 Pīpalkoti 48.96196 79.0666 30.7351 2013-06-16 19:30:00 2013 6

Highest Injuries Event Details

In [ ]:
df[(df['injury_count'] == df['injury_count'].max())]
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year month
9097 Euro News http://www.euronews.com/2015/10/02/fatal-lands... 5km landslide downpour large unknown 253.0 374.0 NaN Guatemala Guatemala 994938.0 Guatemala City 3.11095 -90.4847 14.6367 2015-08-02 00:00:00 2015 8

Fatalities & Injuries Descriptive Statistics

In [ ]:
df[['fatality_count','injury_count']].describe()
Out[ ]:
fatality_count injury_count
count 9648.000000 5359.000000
mean 3.219424 0.751819
std 59.886178 8.458955
min 0.000000 0.000000
25% 0.000000 0.000000
50% 0.000000 0.000000
75% 1.000000 0.000000
max 5000.000000 374.000000
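The statistics above are heavily skewed: the median is 0 while the maximum is 5000, so the mean says little about a typical event. The sketch below shows a few extra summaries that make the skew explicit. The `fatalities` Series is hypothetical sample data, not the catalog itself.

```python
import pandas as pd

# Hypothetical sample standing in for df['fatality_count'].
fatalities = pd.Series([0, 0, 0, 1, 0, 2, 0, 5000, None, 11], dtype="float64")

# Ignore events where no fatality figure was reported.
reported = fatalities.dropna()

# Fraction of reported events with zero fatalities.
zero_share = (reported == 0).mean()

# Median and upper quantiles show how thin the tail is.
tail = reported.quantile([0.5, 0.9, 0.99])

# Share of all fatalities contributed by the single largest event.
top_event_share = reported.max() / reported.sum()

print(zero_share, tail.to_dict(), top_event_share)
```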

Top 15 Countries With Events Fatalities & Injuries

In [ ]:
# select the numeric columns before summing so text and date columns are not aggregated
group_by_country = df.groupby('country_name')[['fatality_count','injury_count']].sum().reset_index()

group_by_country_sort = group_by_country.sort_values('fatality_count',ascending=False)
group_by_country_sort.head(15)
Out[ ]:
country_name fatality_count injury_count
56 India 7069.0 217.0
26 China 4945.0 318.0
0 Afghanistan 2294.0 1.0
98 Philippines 1847.0 138.0
17 Brazil 1743.0 103.0
57 Indonesia 1697.0 149.0
86 Nepal 1473.0 277.0
50 Guatemala 743.0 408.0
93 Pakistan 662.0 70.0
27 Colombia 591.0 84.0
123 Taiwan 540.0 19.0
130 Uganda 534.0 70.0
84 Myanmar [Burma] 498.0 228.0
10 Bangladesh 426.0 170.0
138 Vietnam 375.0 20.0
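Totals per country depend on how many events each country has in the catalog. A minimal sketch of adding the event count and a fatalities-per-event ratio with named aggregation follows. The `events` frame is hypothetical sample data, and the output column names (`events`, `fatalities`, `injuries`, `fatalities_per_event`) are illustrative only.

```python
import pandas as pd

# Hypothetical sample with the same column names as the catalog.
events = pd.DataFrame({
    "country_name": ["India", "India", "China", "Nepal", "China", "India"],
    "fatality_count": [5000.0, 3.0, 11.0, None, 40.0, 0.0],
    "injury_count": [None, 2.0, 5.0, 1.0, None, 0.0],
})

by_country = (
    events.groupby("country_name")
    .agg(
        events=("fatality_count", "size"),      # number of rows, NaN included
        fatalities=("fatality_count", "sum"),   # NaN values are skipped
        injuries=("injury_count", "sum"),
    )
    .sort_values("fatalities", ascending=False)
    .reset_index()
)

by_country["fatalities_per_event"] = by_country["fatalities"] / by_country["events"]
print(by_country)
```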

Fatalities & Injuries By Each Event Category

In [ ]:
group_by_tr = df.groupby('landslide_category')[['fatality_count','injury_count']].sum().reset_index()

group_by_tr_sort = group_by_tr.sort_values('fatality_count',ascending=False)
group_by_tr_sort
Out[ ]:
landslide_category fatality_count injury_count
5 landslide 16912.0 2703.0
2 debris_flow 5770.0 407.0
6 mudslide 5624.0 404.0
0 complex 2139.0 37.0
9 rock_fall 319.0 292.0
10 snow_avalanche 112.0 36.0
13 unknown 74.0 99.0
7 other 58.0 37.0
12 translational_slide 27.0 1.0
4 lahar 12.0 0.0
8 riverbank_collapse 7.0 8.0
3 earth_flow 5.0 3.0
1 creep 0.0 0.0
11 topple 0.0 0.0
In [ ]:
group_by_tr_sort.plot(x="landslide_category", y=["fatality_count", "injury_count"], kind="bar",figsize=(22,8))
plt.xticks(rotation='vertical',size=15)
plt.title('Fatalities & Injuries By Each Event Category',size=15)
plt.show()

Events Occurred Due To Storms

In [ ]:
storms = pd.DataFrame(df[df['storm_name'].notnull()])
storms
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year month
4 The Freeman http://www.philstar.com/cebu-news/621414/lands... 5km landslide tropical_cyclone medium unknown 0.0 NaN Supertyphoon Juan (Megi) Philippines Central Visayas 798634.0 Cebu City 2.02204 123.897800 10.333600 2010-10-16 12:00:00 2010 10
7 Crónica Diaria http://www.cronica.com.mx/notas/2007/320673.html 10km complex tropical_cyclone medium unknown 3.0 NaN Tropical Storm Henrietta Mexico Sinaloa 3191.0 El Limón de los Ramos 10.88351 -107.622000 24.953100 2007-09-02 00:00:00 2007 9
61 The Freeman http://www.philstar.com/cebu-news/621414/lands... 25km landslide tropical_cyclone medium unknown 0.0 NaN Supertyphoon Juan (Megi) Philippines Central Visayas 48638.0 Consolacion 5.02950 123.915300 10.395000 2010-10-16 00:00:00 2010 10
63 The Freeman http://www.philstar.com/cebu-news/621414/lands... 25km landslide tropical_cyclone medium unknown 0.0 NaN Supertyphoon Juan (Megi) Philippines Central Visayas 7779.0 Asturias 0.54750 123.720000 10.570000 2010-10-17 00:00:00 2010 10
64 The Freeman http://www.philstar.com/cebu-news/621414/lands... 10km landslide tropical_cyclone medium unknown 0.0 NaN Supertyphoon Juan (Megi) Philippines Central Visayas 2968.0 Sogod 2.66226 123.980000 10.770000 2010-10-17 00:00:00 2010 10
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
10877 The Telegraph http://www.telegraph.co.uk/news/2017/02/08/fal... exact topple downpour medium above_coast 0.0 0.0 Storm Doris NaN NaN NaN NaN NaN -3.231704 50.679618 2017-02-08 18:14:00 2017 2
10904 NBC News http://www.nbcnews.com/news/asia/heavy-floodin... 10km landslide tropical_cyclone large above_road 1.0 0.0 Typhoon Nanmadol NaN NaN NaN NaN NaN 130.937535 33.386309 2017-07-06 13:16:00 2017 7
10939 The Daily Examiner https://www.dailyexaminer.com.au/news/land-sli... 10km landslide tropical_cyclone small above_river 0.0 0.0 Cyclone Debbie NaN NaN NaN NaN NaN 153.143192 -29.472252 2017-04-04 13:16:00 2017 4
10976 NBC News http://www.nbcnews.com/news/asia/heavy-floodin... 25km landslide tropical_cyclone medium natural_slope 0.0 0.0 Typhoon Nanmadol NaN NaN NaN NaN NaN 130.743680 33.417091 2017-07-06 13:16:00 2017 7
11003 The Times of India http://timesofindia.indiatimes.com/city/coimba... 10km landslide tropical_cyclone small above_road 0.0 0.0 Cyclone Vardah NaN NaN NaN NaN NaN 76.903045 11.328578 2016-12-16 13:22:00 2016 12

577 rows × 21 columns

Storm Events Geospatial Visualization

In [ ]:
map_7 = folium.Map(location=[51.1657,10.4515], tiles='Stamen Toner', zoom_start=2) 

# Build a list of [latitude, longitude] pairs
heat_data6 = [[row['latitude'],row['longitude']] for index, row in storms.iterrows()]

# Plot it on the map
HeatMap(heat_data6).add_to(map_7)

# Display the map
map_7

Top 10 Storms By Number Of Events Caused

In [ ]:
df['storm_name'].value_counts().head(10)
Out[ ]:
Supertyphoon Juan (Megi)      32
Tropical Depression Parma     23
Agaton                        20
Tropical Depression Urduja    15
Tropical Storm Tomas          14
Hurricane Tomas               13
Tropical Cyclone Agatha       12
Trami                         12
Lawin                         10
Utor                          10
Name: storm_name, dtype: int64
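"Tropical Storm Tomas" and "Hurricane Tomas" appear as separate entries in the list above and may refer to the same system under two labels. The sketch below shows one way to merge such labels before counting. The `aliases` mapping is a hypothetical example and would need to be checked against the actual storm records.

```python
import pandas as pd

# Hypothetical sample standing in for df['storm_name'].
names = pd.Series([
    "Tropical Storm Tomas", "Hurricane Tomas", "Trami", "Hurricane Tomas", None,
])

# Hypothetical alias table: map every known label to one canonical name.
aliases = {
    "Tropical Storm Tomas": "Tomas",
    "Hurricane Tomas": "Tomas",
}

normalized = names.replace(aliases)
counts = normalized.value_counts()
print(counts)
```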

Fatalities & Injuries Due To The Top 10 Most Frequent Storms

In [ ]:
# names of the 10 storms with the most recorded events
# (named top_storms so the storms DataFrame above is not overwritten)
top_storms = ['Supertyphoon Juan (Megi)','Tropical Depression Parma',
              'Agaton','Tropical Depression Urduja','Tropical Storm Tomas',
              'Hurricane Tomas','Tropical Cyclone Agatha','Trami','Lawin','Utor']

# keep only the events caused by one of those storms
storms_df = df[df['storm_name'].isin(top_storms)].copy()
storms_df
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year month
4 The Freeman http://www.philstar.com/cebu-news/621414/lands... 5km landslide tropical_cyclone medium unknown 0.0 NaN Supertyphoon Juan (Megi) Philippines Central Visayas 798634.0 Cebu City 2.02204 123.8978 10.3336 2010-10-16 12:00:00 2010 10
61 The Freeman http://www.philstar.com/cebu-news/621414/lands... 25km landslide tropical_cyclone medium unknown 0.0 NaN Supertyphoon Juan (Megi) Philippines Central Visayas 48638.0 Consolacion 5.02950 123.9153 10.3950 2010-10-16 00:00:00 2010 10
63 The Freeman http://www.philstar.com/cebu-news/621414/lands... 25km landslide tropical_cyclone medium unknown 0.0 NaN Supertyphoon Juan (Megi) Philippines Central Visayas 7779.0 Asturias 0.54750 123.7200 10.5700 2010-10-17 00:00:00 2010 10
64 The Freeman http://www.philstar.com/cebu-news/621414/lands... 10km landslide tropical_cyclone medium unknown 0.0 NaN Supertyphoon Juan (Megi) Philippines Central Visayas 2968.0 Sogod 2.66226 123.9800 10.7700 2010-10-17 00:00:00 2010 10
472 Interaksyon NaN 5km landslide tropical_cyclone medium unknown 0.0 0.0 Agaton Philippines Eastern Visayas 2920.0 Calbiga 10.15072 125.0725 11.5515 2014-01-11 00:00:00 2014 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
9648 newsinfo.inquirer http://newsinfo.inquirer.net/inquirerheadlines... 25km landslide tropical_cyclone medium unknown 0.0 NaN Supertyphoon Juan (Megi) Philippines Cagayan Valley 2173.0 Banquero 4.58730 121.7753 16.9817 2010-10-18 00:00:00 2010 10
9750 Panahon Ngayon https://weatherngayon.wordpress.com/2014/01/12... 5km landslide tropical_cyclone medium unknown 0.0 0.0 Agaton Philippines Northern Mindanao 9664.0 Kapatagan 0.62523 123.7746 7.9020 2014-01-11 00:00:00 2014 1
9783 newsday.tt http://newsday.co.tt/news/0,130093.html 5km landslide tropical_cyclone medium unknown 0.0 NaN Hurricane Tomas Trinidad and Tobago Eastern Tobago 0.0 Roxborough 0.91163 -60.5750 11.2505 2010-10-31 09:00:00 2010 10
9853 news.pia.gov.ph http://news.pia.gov.ph/index.php?article=71376... 25km landslide tropical_cyclone medium unknown 1.0 NaN Utor Philippines Cordillera Administrative Region 2653.0 Tabaan 7.07853 120.5811 16.2852 2013-08-12 08:30:00 2013 8
9939 Inquirer.net http://newsinfo.inquirer.net/562551/2000-famil... 10km landslide tropical_cyclone medium unknown 0.0 0.0 Agaton Philippines Davao 16671.0 Nabunturan 0.86072 125.9667 7.6000 2014-01-12 00:00:00 2014 1

161 rows × 21 columns

In [ ]:
# Create the map
map_8 = folium.Map(location=[51.1657,10.4515], tiles='cartodbpositron', zoom_start=3) 


mc2 = MarkerCluster()

for idx, row in storms_df.iterrows():
    # skip events without coordinates
    if not math.isnan(row['longitude']) and not math.isnan(row['latitude']):
        mc2.add_child(Marker(location=[row['latitude'], row['longitude']], tooltip=row['storm_name']))

# add the marker cluster to the map
map_8.add_child(mc2)

# Display the map
map_8

Alternative Visualization As A Heat Map For A Better View

In [ ]:
map_9 = folium.Map(location=[51.1657,10.4515], tiles='Stamen Toner', zoom_start=3) 

# Build a list of [latitude, longitude] pairs
heat_data7 = [[row['latitude'],row['longitude']] for index, row in storms_df.iterrows()]

# Plot it on the map
HeatMap(heat_data7).add_to(map_9)

# Display the map
map_9

Most Storm-Related Events Occurred Around The Philippines & The Caribbean Sea

Fatalities & Injuries Due To Top 10 Storms

Total Fatalities & Injuries Due To Top 10 Storms

In [ ]:
group_by_storms_df = storms_df.groupby('storm_name')[['fatality_count','injury_count']].sum().reset_index()

group_by_storms_df_sort = group_by_storms_df.sort_values('fatality_count',ascending=False)
group_by_storms_df_sort
Out[ ]:
storm_name fatality_count injury_count
6 Tropical Depression Parma 332.0 0.0
5 Tropical Cyclone Agatha 68.0 0.0
0 Agaton 56.0 10.0
8 Tropical Storm Tomas 26.0 0.0
9 Utor 11.0 4.0
2 Lawin 8.0 0.0
1 Hurricane Tomas 4.0 0.0
4 Trami 3.0 0.0
7 Tropical Depression Urduja 3.0 0.0
3 Supertyphoon Juan (Megi) 1.0 0.0

Average Fatalities & Injuries Due To Top 10 Storms

In [ ]:
# select the numeric columns first; averaging text or date columns can raise an error
group_by_storms_df_avg = storms_df.groupby('storm_name')[['fatality_count','injury_count']].mean().reset_index()

group_by_storms_df_avg_sort = group_by_storms_df_avg.sort_values('fatality_count',ascending=False)
group_by_storms_df_avg_sort
Out[ ]:
storm_name fatality_count injury_count
6 Tropical Depression Parma 14.434783 NaN
5 Tropical Cyclone Agatha 5.666667 NaN
0 Agaton 2.800000 0.5
8 Tropical Storm Tomas 1.857143 NaN
9 Utor 1.571429 4.0
4 Trami 1.000000 NaN
2 Lawin 0.800000 0.0
1 Hurricane Tomas 0.307692 NaN
7 Tropical Depression Urduja 0.200000 NaN
3 Supertyphoon Juan (Megi) 0.031250 NaN
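In the two tables above, several storms show a total `injury_count` of 0.0 but an average of NaN. This happens when every injury value for a storm is missing: `sum()` skips NaN and returns 0 for an empty set, while `mean()` has nothing to average. The sketch below reproduces the effect on hypothetical sample data and shows `min_count=1`, which makes the sum report NaN instead of 0 in that case.

```python
import pandas as pd

# Hypothetical sample: one storm with no injury data, one with data.
storm_events = pd.DataFrame({
    "storm_name": [
        "Tropical Depression Parma", "Tropical Depression Parma",
        "Agaton", "Agaton",
    ],
    "injury_count": [None, None, 1.0, 0.0],
})

grouped = storm_events.groupby("storm_name")["injury_count"]

default_sum = grouped.sum()            # all-NaN group sums to 0.0
strict_sum = grouped.sum(min_count=1)  # all-NaN group stays NaN
mean = grouped.mean()                  # all-NaN group is NaN

print(pd.DataFrame({"sum": default_sum, "sum_min_count": strict_sum, "mean": mean}))
```

With `min_count=1`, a zero in the totals table would mean "zero injuries reported" rather than "no injury data".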

Fatalities & Injuries By Years

In [ ]:
# group values by year & aggregate fatality & injury counts
group_by_years = df.groupby('year')[['fatality_count','injury_count']].sum().reset_index()

# sort values in descending order
group_by_years_sort = group_by_years.sort_values('fatality_count',ascending=False)
group_by_years_sort
Out[ ]:
year fatality_count injury_count
16 2013 6361.0 405.0
13 2010 5328.0 51.0
17 2014 3868.0 800.0
11 2008 2288.0 93.0
18 2015 2267.0 1027.0
14 2011 2146.0 36.0
20 2017 1897.0 642.0
12 2009 1792.0 12.0
10 2007 1734.0 156.0
19 2016 1491.0 779.0
15 2012 1462.0 28.0
9 2006 324.0 0.0
7 2004 100.0 0.0
8 2005 3.0 0.0
1 1993 0.0 0.0
6 2003 0.0 0.0
5 1998 0.0 0.0
4 1997 0.0 0.0
3 1996 0.0 0.0
2 1995 0.0 0.0
0 1988 0.0 0.0

Time Series Of Fatalities & Injuries

In [ ]:
# Visualize Fatalities & Injuries By Years
fig, ax = plt.subplots(1, 1, figsize=[22, 9])

ax.plot(group_by_years['year'], group_by_years['fatality_count'])
ax.plot(group_by_years['year'], group_by_years['injury_count'])

plt.xlabel('Years',size=15)
plt.ylabel('Count',size=15)

plt.legend(['Fatalities Count','Injuries Count'], loc=2)
ax.set_title('Time Series Of Fatalities & Injuries ',size=17)
Out[ ]:
Text(0.5, 1.0, 'Time Series Of Fatalities & Injuries ')

Time Series Of Fatalities & Injuries Vs Occurred Events By Years

In [ ]:
# Visualize occurred events vs fatalities & injuries by years
fig, ax = plt.subplots(1, 1, figsize=[22, 9])

ax.plot(gr_by_years['year'], gr_by_years['Occured_Events'])
ax.plot(group_by_years['year'], group_by_years['fatality_count'])
ax.plot(group_by_years['year'], group_by_years['injury_count'])

plt.xlabel('Years',size=15)
plt.ylabel('Count',size=15)

plt.legend(['Occurred Events Count','Fatalities Count','Injuries Count'], loc=2)
ax.set_title('Time Series Of Fatalities & Injuries Vs Occurred Events By Years',size=17)
Out[ ]:
Text(0.5, 1.0, 'Time Series Of Fatalities & Injuries Vs Occurred Events By Years')
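The three series in this plot have very different scales, so a ratio can be easier to read than raw counts. The sketch below joins yearly event counts with yearly casualty totals and derives fatalities per event. Both input frames are hypothetical samples that mimic the shape of `gr_by_years` and `group_by_years`, and the `fatalities_per_event` column is illustrative.

```python
import pandas as pd

# Hypothetical yearly tables shaped like gr_by_years and group_by_years.
events_per_year = pd.DataFrame({
    "year": [2013, 2014, 2015],
    "Occured_Events": [100, 200, 50],
})
casualties_per_year = pd.DataFrame({
    "year": [2013, 2014, 2015],
    "fatality_count": [500.0, 100.0, 25.0],
    "injury_count": [10.0, 20.0, 5.0],
})

# An outer join keeps years that appear in only one of the tables.
combined = (
    events_per_year
    .merge(casualties_per_year, on="year", how="outer")
    .sort_values("year")
    .reset_index(drop=True)
)

combined["fatalities_per_event"] = combined["fatality_count"] / combined["Occured_Events"]
print(combined)
```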

Exploratory Data Analysis & Visualization In Sri Lanka

In [ ]:
srilanka_df = pd.DataFrame(df[(df['longitude'] < 82.0000) & (df['latitude'] < 8.0000) & (df['latitude'] > 6.0000) & (df['longitude'] > 80.0000)])
In [ ]:
srilanka_df.head()
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year month
224 Sunday Times http://www.sundaytimes.lk/article/1022689/two-... 5km landslide unknown medium urban 6.0 NaN NaN NaN NaN NaN NaN NaN 80.154021 6.519725 2017-05-25 20:14:00 2017 5
288 News Radio.lk https://www.newsradio.lk/2017/05/26/2-landslid... 5km landslide continuous_rain medium urban NaN NaN NaN NaN NaN NaN NaN NaN 80.160486 6.586835 2017-05-26 16:59:00 2017 5
308 Sunday Times http://www.sundaytimes.lk/article/1022689/two-... 5km landslide rain medium urban 9.0 NaN NaN NaN NaN NaN NaN NaN 80.606877 6.093508 2017-05-25 16:59:00 2017 5
344 Daily Mirror http://www.dailymirror.lk/article/Six-feared-b... 5km landslide continuous_rain medium urban 6.0 NaN NaN NaN NaN NaN NaN NaN 80.231427 6.509504 2017-05-25 16:59:00 2017 5
351 News Radio.lk https://www.newsradio.lk/2017/05/26/2-landslid... 5km landslide continuous_rain medium above_road 0.0 0.0 NaN NaN NaN NaN NaN NaN 80.491387 6.260793 2017-05-26 16:59:00 2017 5
In [ ]:
srilanka_df.isnull().sum()
Out[ ]:
source_name                   0
source_link                   0
location_accuracy             0
landslide_category            0
landslide_trigger             0
landslide_size                0
landslide_setting             1
fatality_count               12
injury_count                 62
storm_name                   83
country_name                 11
admin_division_name          11
admin_division_population    11
gazeteer_closest_point       11
gazeteer_distance            11
longitude                     0
latitude                      0
Date                          0
Time                          0
year                          0
month                         0
dtype: int64
In [ ]:
srilanka_df.shape
Out[ ]:
(86, 21)

There Are 86 Events In The Sri Lanka Bounding Box, Recorded Between 2007 And 2017
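The selection above relies on a latitude and longitude box because `country_name` is missing for some rows. The box stops at 8°N, so events in the north of the island fall outside it. A minimal sketch of combining the box with the country name, where one is recorded, is shown below. The `events` frame is hypothetical sample data, and `between` includes the box edges by default, unlike the strict comparisons in the cell above.

```python
import pandas as pd

# Hypothetical sample with the same column names as the catalog.
events = pd.DataFrame({
    "country_name": ["Sri Lanka", None, "India", None, "Sri Lanka"],
    "latitude": [7.16, 6.52, 11.33, 33.39, 9.40],
    "longitude": [80.45, 80.15, 76.90, 130.94, 80.40],
})

# Same box as the notebook cell (edges included here).
in_box = events["latitude"].between(6.0, 8.0) & events["longitude"].between(80.0, 82.0)

# Rows explicitly labelled as Sri Lanka; missing names compare as False.
named = events["country_name"].eq("Sri Lanka")

srilanka = events[in_box | named]
print(srilanka)
```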


Geospatial Visualization Of Events In Sri Lanka

In [ ]:
m11 = plugins.DualMap(location=(7.8731,80.7718), tiles=None, zoom_start=7.5)

folium.TileLayer("openstreetmap").add_to(m11.m1)
folium.TileLayer("Stamen Terrain").add_to(m11.m2)

# map 1: clustered markers with category, trigger and size tooltips
mc15 = MarkerCluster()
for idx, row in srilanka_df.iterrows():
    # skip events without coordinates
    if not math.isnan(row['longitude']) and not math.isnan(row['latitude']):
        tooltip = ("<b> Category : </b>" + row['landslide_category']
                   + "<br> <b> Trigger By : </b>" + row['landslide_trigger']
                   + "<br> <b> Size : </b>" + row['landslide_size'])
        mc15.add_child(Marker(location=[row['latitude'], row['longitude']], tooltip=tooltip))

# add the marker cluster to the first map
m11.m1.add_child(mc15)

# map 2: heat map of the same events
heat_data9 = [[row['latitude'],row['longitude']] for index, row in srilanka_df.iterrows()]
HeatMap(heat_data9).add_to(m11.m2)


m11

According To The Above Map,

  • In Sri Lanka, Most Events Occurred In Mountainous Areas Such As Uva Province, Central Province, And Sabaragamuwa Province

Let's Analyse These Events More Deeply


Time Series Of Occurred Events By Years In Sri Lanka

In [ ]:
#group by years
gr_by_years_sl = pd.DataFrame(srilanka_df.groupby('year')['source_name'].count().reset_index())

#change columns names
gr_by_years_sl.columns = ['year','Occured_Events']
gr_by_years_sl
Out[ ]:
year Occured_Events
0 2007 4
1 2008 5
2 2010 10
3 2011 34
4 2012 2
5 2013 3
6 2014 8
7 2015 5
8 2016 6
9 2017 9
In [ ]:
# Visualize event occurred by Year

fig, ax = plt.subplots(1, 1, figsize=[18, 8])
ax.plot(gr_by_years_sl['year'], gr_by_years_sl['Occured_Events'])

plt.xlabel('Years',size=15)
plt.ylabel('Occurred Times',size=15)

plt.legend(['Event Occurred Times'], loc=2)
ax.set_title('Event Occurred Times By Years',size=17)
Out[ ]:
Text(0.5, 1.0, 'Event Occurred Times By Years')

Time Series Of Occurred Events By Months In Sri Lanka

In [ ]:
#group by months
gr_by_month_sl = pd.DataFrame(srilanka_df.groupby('month')['source_name'].count().reset_index())

#change columns names
gr_by_month_sl.columns = ['month','Occured_Events']
gr_by_month_sl
Out[ ]:
month Occured_Events
0 1 9
1 2 21
2 3 2
3 4 9
4 5 20
5 6 2
6 8 1
7 9 3
8 10 4
9 11 6
10 12 9
In [ ]:
# Visualize event occurred count by months

fig, ax = plt.subplots(1, 1, figsize=[18, 8])
ax.plot(gr_by_month_sl['month'], gr_by_month_sl['Occured_Events'])

plt.xlabel('Months',size=15)
plt.ylabel('Count',size=15)

plt.legend(['Occurred Events'], loc=2)
ax.set_title('Occurred Events By Months',size=17)
Out[ ]:
Text(0.5, 1.0, 'Occurred Events By Months')
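The monthly table has no row for month 7, so a line plot connects June directly to August and hides the zero. The sketch below fills missing months with 0 before plotting. The `monthly` frame copies the counts from the table above, and `full` is a hypothetical name for the completed table.

```python
import pandas as pd

# Monthly counts copied from the table above (month 7 is absent).
monthly = pd.DataFrame({
    "month": [1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12],
    "Occured_Events": [9, 21, 2, 9, 20, 2, 1, 3, 4, 6, 9],
})

# Reindex onto all 12 months; months without events get 0.
all_months = pd.Index(range(1, 13), name="month")
full = (
    monthly.set_index("month")
    .reindex(all_months, fill_value=0)
    .reset_index()
)
print(full)
```

The same approach could be applied to the yearly table, where 2009 has no row.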

Time Series Of Fatalities & Injuries Comparing To Events Occurred Times In Sri Lanka

In [ ]:
# group values by year & aggregate fatality & injury counts in Sri Lanka
gr_by_ft_in_sl_yr = srilanka_df.groupby('year')[['fatality_count','injury_count']].sum().reset_index()

# sort values in descending order
gr_by_ft_in_sl_yr_sort = gr_by_ft_in_sl_yr.sort_values('fatality_count',ascending=False)
gr_by_ft_in_sl_yr_sort
Out[ ]:
year fatality_count injury_count
8 2016 116.0 2.0
9 2017 110.0 0.0
6 2014 59.0 1.0
7 2015 17.0 1.0
3 2011 14.0 0.0
1 2008 13.0 0.0
4 2012 6.0 0.0
2 2010 3.0 0.0
0 2007 2.0 0.0
5 2013 1.0 0.0
In [ ]:
# Visualize Fatalities & Injuries By Years
fig, ax = plt.subplots(1, 1, figsize=[22, 9])

ax.plot(gr_by_years_sl['year'], gr_by_years_sl['Occured_Events'])
ax.plot(gr_by_ft_in_sl_yr['year'], gr_by_ft_in_sl_yr['fatality_count'])
ax.plot(gr_by_ft_in_sl_yr['year'], gr_by_ft_in_sl_yr['injury_count'])

plt.xlabel('Years',size=15)
plt.ylabel('Count',size=15)

plt.legend(['Occurred Events Count','Fatalities Count','Injuries Count'], loc=2)
ax.set_title('Time Series Of Fatalities & Injuries Vs Occurred Events By Years In Sri Lanka',size=17)
Out[ ]:
Text(0.5, 1.0, 'Time Series Of Fatalities & Injuries Vs Occurred Events By Years In Sri Lanka')

Details About The Event With The Highest Fatalities

In [ ]:
high_ft = pd.DataFrame(srilanka_df[(srilanka_df['fatality_count'] == srilanka_df['fatality_count'].max())])
high_ft
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year month
3941 Al Jazeera http://www.aljazeera.com/news/2016/05/agony-sr... 1km mudslide monsoon very_large unknown 101.0 0.0 NaN Sri Lanka Central 24730.0 Gampola 12.91294 80.4484 7.1623 2016-05-18 00:00:00 2016 5

Sri Lanka Was Hit By Its Biggest Landslide Disaster In 2016. You Can Read About It Here: https://en.wikipedia.org/wiki/2016_Sri_Lankan_floods

Top 10 Events By Fatalities

In [ ]:
sl_ft_sort = srilanka_df.sort_values('fatality_count',ascending=False)
sl_ft_sort.head(10)
Out[ ]:
source_name source_link location_accuracy landslide_category landslide_trigger landslide_size landslide_setting fatality_count injury_count storm_name country_name admin_division_name admin_division_population gazeteer_closest_point gazeteer_distance longitude latitude Date Time year month
3941 Al Jazeera http://www.aljazeera.com/news/2016/05/agony-sr... 1km mudslide monsoon very_large unknown 101.0 0.0 NaN Sri Lanka Central 24730.0 Gampola 12.91294 80.448400 7.162300 2016-05-18 00:00:00 2016 5
381 Reuters http://www.reuters.com/article/us-sri-lanka-di... 25km landslide monsoon large unknown 46.0 0.0 NaN NaN NaN NaN NaN NaN 80.384613 6.702273 2017-05-26 15:17:00 2017 5
410 Reuters http://www.reuters.com/article/us-sri-lanka-di... 1km landslide continuous_rain large natural_slope 38.0 0.0 NaN NaN NaN NaN NaN NaN 80.181359 6.518587 2017-05-26 15:17:00 2017 5
5849 Aol http://www.aol.com/article/2014/10/30/hundreds... 1km mudslide monsoon large deforested_slope 37.0 0.0 NaN Sri Lanka Uva 4721.0 Haputale 6.54736 81.020900 6.763100 2014-10-29 07:30:00 2014 10
1668 BBC http://www.bbc.com/news/world-asia-36320724 25km landslide continuous_rain large natural_slope 13.0 0.0 NaN Sri Lanka Central 24730.0 Gampola 13.21917 80.454800 7.130600 2016-05-17 15:00:00 2016 5
308 Sunday Times http://www.sundaytimes.lk/article/1022689/two-... 5km landslide rain medium urban 9.0 NaN NaN NaN NaN NaN NaN NaN 80.606877 6.093508 2017-05-25 16:59:00 2017 5
9432 Hiru News http://www.hirunews.lk/84498/update-9-killed-d... 5km landslide rain medium unknown 7.0 0.0 NaN Sri Lanka Western 7500.0 Horawala Junction 5.95287 80.156300 6.526100 2014-06-01 23:00:00 2014 6
3876 News.com.au http://www.news.com.au/world/breaking-news/sri... 1km landslide downpour medium unknown 7.0 0.0 NaN Sri Lanka Central 3545.0 Talawakele 11.85620 80.591300 7.021000 2015-09-25 00:00:00 2015 9
4511 World Socialist Website https://www.wsws.org/en/articles/2015/10/02/es... 1km landslide downpour medium unknown 7.0 1.0 NaN Sri Lanka Central 3545.0 Talawakele 13.52031 80.693000 7.054300 2015-09-25 14:30:00 2015 9
4989 lankaeverything http://www.lankaeverything.com/vinews/srilanka... 50km landslide tropical_cyclone medium unknown 6.0 NaN NaN Sri Lanka Western 8982.0 Horana South 12.63751 80.174400 6.739700 2008-04-27 00:00:00 2008 4

Distribution Of Categories In Occurred Events In Sri Lanka

In [ ]:
def event_by_category_sl():

  #print value count
  print(srilanka_df['landslide_category'].value_counts()) 

  #print in countplot graph
  plt.figure(figsize=(18,8.5))
  sns.countplot(x=srilanka_df['landslide_category'],palette='YlGnBu_r')
  plt.xticks(rotation='vertical',size=15)
  plt.title('Event Category Distribution In Sri Lanka',size=16)
  plt.xlabel('',size=10)
  plt.show()


event_by_category_sl()
landslide      74
rock_fall       4
mudslide        4
debris_flow     2
other           1
complex         1
Name: landslide_category, dtype: int64

Distribution Of Trigger Types In Occurred Events In Sri Lanka

In [ ]:
def event_by_triger_sl():

  #print value count
  print(srilanka_df['landslide_trigger'].value_counts()) 

  #print in countplot graph
  plt.figure(figsize=(18,8.5))
  sns.countplot(x=srilanka_df['landslide_trigger'],palette='crest')
  plt.xticks(rotation='vertical',size=15)
  plt.title('Event By Trigger Type In Sri Lanka',size=16)
  plt.xlabel('',size=10)
  plt.show()


event_by_triger_sl()
downpour            43
rain                15
continuous_rain     10
monsoon              9
tropical_cyclone     4
construction         3
unknown              2
Name: landslide_trigger, dtype: int64
In [ ]:
def event_by_size_sl():

  #print value count
  print(srilanka_df['landslide_size'].value_counts()) 

  #print in countplot graph
  plt.figure(figsize=(18,8.5))
  sns.countplot(x=srilanka_df['landslide_size'],palette='Set2_r')
  plt.xticks(rotation='vertical',size=15)
  plt.title('Event Size Distribution In Sri Lanka',size=16)
  plt.xlabel('',size=10)
  plt.show()


event_by_size_sl()
medium        73
large          6
small          6
very_large     1
Name: landslide_size, dtype: int64

Sources That Reported Events In Sri Lanka

In [ ]:
Reported_source_sl = pd.DataFrame(srilanka_df['source_name'].value_counts().head(15)).reset_index()
Reported_source_sl.columns = ['Source Name','Reported Count']
Reported_source_sl
Out[ ]:
Source Name Reported Count
0 print.dailymirror 15
1 News 1st 4
2 dailymirror 4
3 Hiru News 4
4 Daily Mirror 3
5 sundaytimes 3
6 colombopage 3
7 Colombopage.com 2
8 Sunday Times 2
9 Daily News 2
10 reliefweb 2
11 www.dailynews.lk 2
12 dailynews 2
13 News Radio.lk 2
14 sundaytimes.lk 2
In [ ]:
# visualize how many events each source reported in Sri Lanka

plt.figure(figsize=(18,12))
sns.barplot(x="Reported Count", y="Source Name", 
            data=Reported_source_sl,
            palette="gist_earth")

plt.xticks(size=12)
plt.title('Reported Sources By Number Of Reported Events In Sri Lanka',size=16)
plt.xlabel('Reported Times',size=10)
plt.show()

print.dailymirror Reported The Most Events (15)
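The source table lists the same outlet under several spellings, for example "print.dailymirror", "dailymirror" and "Daily Mirror", which splits its count across rows. The sketch below shows a rough normalization before counting. The `normalize_source` helper and its list of stripped tokens are hypothetical heuristics, not part of the dataset, and the `sources` Series is sample data.

```python
import pandas as pd

# Hypothetical sample standing in for srilanka_df['source_name'].
sources = pd.Series([
    "print.dailymirror", "dailymirror", "Daily Mirror",
    "sundaytimes", "Sunday Times", "sundaytimes.lk", "Hiru News",
])


def normalize_source(name):
    """Lower-case a source name and strip common web prefixes and suffixes."""
    key = name.lower()
    for token in ("www.", "print.", ".com", ".lk"):
        key = key.replace(token, "")
    # Drop spaces and punctuation so "Daily Mirror" matches "dailymirror".
    return "".join(ch for ch in key if ch.isalnum())


counts = sources.map(normalize_source).value_counts()
print(counts)
```

A heuristic like this can merge outlets that are actually different, so the merged names are worth reviewing by hand.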

In [ ]:
# export notebook file as HTML file

'''
%%shell
jupyter nbconvert --to html /content/Accomondations_In_Sri_Lanka.ipynb

'''